Improvements in linear transform based speaker adaptation
نویسندگان
چکیده
This paper presents three forms of linear transform based speaker adaptation that can give better performance than standard maximum likelihood linear regression (MLLR) adaptation. For unsupervised adaptation, a lattice-based technique is introduced which is compared to MLLR using confidence scores. For supervised adaptation, estimation of the adaptation matrices using the maximum mutual information criterion is discussed which leads to the MMILR approach. Recognition experiments show that lattice MLLR can reduce word error rates on a Switchboard task by 1.4% absolute. For recognition of non-native speech from the Wall Street Journal database, a reduction in word error rate of 10-16% relative was obtained using MMILR compared to standard MLLR.
منابع مشابه
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker adaptation based on nonlinear spectral transform for speech recognition
This paper proposes a speaker adaptation technique using a nonlinear spectral transform based on GMMs. One of the most popular forms of speaker adaptation is based on linear transforms, e.g., MLLR. Although MLLR uses multiple transforms according to regression classes, only a single linear transform is applied to each state. The proposed method performs nonlinear speaker adaptation based on a n...
متن کاملMLLR-like speaker adaptation based on linearization of VTLN with MFCC features
In this paper, an MLLR-like adaptation approach is proposed whereby the transformation of the means is performed deterministically based on linearization of VTLN. Biases and adaptation of the variances are estimated statistically by the EM algorithm. In the discrete frequency domain, we show that under certain approximations, frequency warping with Mel-£lterbank-based MFCCs equals a linear tran...
متن کاملImprovement of MLLR Speaker Adaptation Using a Novel Method
This paper presents a technical speaker adaptation method called WMLLR, which is based on maximum likelihood linear regression (MLLR). In MLLR, a linear regression-based transform which adapted the HMM mean vectors was calculated to maximize the likelihood of adaptation data. In this paper, the prior knowledge of the initial model is adequately incorporated into the adaptation. A series of spea...
متن کامل